Hans van Kranenburg [Mon, 13 May 2019 19:56:10 +0000 (21:56 +0200)]
Commit patch queue (exported by git-debrebase)
[git-debrebase make-patches: export and commit patches]
Hans van Kranenburg [Mon, 13 May 2019 19:55:03 +0000 (21:55 +0200)]
Declare fast forward / record previous work
[git-debrebase pseudomerge: quick]
Hans van Kranenburg [Sun, 10 Feb 2019 17:26:45 +0000 (18:26 +0100)]
tools/xl/bash-completion: also complete 'xen'
We have the `xen` alias for xl in Debian, since in the past it was a
command that could execute either xl or xm.
Now, it always does xl, so, complete the same stuff for it as we have
for xl.
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
[git-debrebase split: mixed commit: upstream part]
Ian Jackson [Fri, 22 Feb 2019 12:24:35 +0000 (12:24 +0000)]
pygrub: Specify -rpath LIBEXEC_LIB when building fsimage.so
If LIBEXEC_LIB is not on the default linker search path, the python
fsimage.so module fails to find libfsimage.so.
Add the relevant directory to the rpath explicitly.
(This situation occurs in the Debian package, where
--with-libexec-libdir is used to put each Xen version's libraries and
utilities in their own directory, to allow them to be coinstalled.)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Bastian Blank [Sat, 5 Jul 2014 09:47:01 +0000 (11:47 +0200)]
pygrub: Set sys.path
We install libfsimage in a non-standard path for Reasons.
(See debian/rules.)
This patch was originally part of `tools-pygrub-prefix.diff'
(eg commit
51657319be54) and included changes to the Makefile to
change the installation arrangements (we do that part in the rules now
since that is a lot less prone to conflicts when we update) and to
shared library rpath (which is now done in a separate patch).
(Commit message rewritten by Ian Jackson.)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
squash! pygrub: Set sys.path and rpath
Ian Jackson [Thu, 21 Feb 2019 16:05:40 +0000 (16:05 +0000)]
hotplug-common: Do not adjust LD_LIBRARY_PATH
This is in the upstream script because on non-Debian systems, the
default install locations in /usr/local/lib might not be on the linker
path, and as a result the hotplug scripts would break.
A reason we might need it in Debian is our multiple version
coinstallation scheme. However, the hotplug scripts all call the
utilities via the wrappers, and the binaries are configured to load
from the right place anyway.
This setting is an annoyance because it requires libdir, which is an
arch-specific path but comes from a file we want to put in
xen-utils-common, an arch:all package.
So drop this setting.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Sat, 9 Feb 2019 16:27:26 +0000 (17:27 +0100)]
sysconfig.xencommons.in: Strip and debianize
Strip all options that are for stuff we don't ship, which is 1)
xenstored as stubdom and 2) xenbackendd, which seems to be dead code
anyway. [1]
It seems useful to give the user the option to revert to xenstored
instead of the default oxenstored if they really want.
[1] https://lists.xen.org/archives/html/xen-devel/2015-07/msg04427.html
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Acked-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Hans van Kranenburg [Thu, 3 Jan 2019 23:35:45 +0000 (00:35 +0100)]
vif-common: disable handle_iptable
Also see Debian bug #894013. The current attempt at providing
anti-spoofing rules results in a situation that does not have any
effect. Also note that forwarding bridged traffic to iptables is not
enabled by default, and that for openvswitch users it does not make any
sense.
So, stop cluttering the live iptables ruleset.
This functionality seems to be introduced before 2004 and since then it
has never got some additional love.
It would be nice to have a proper discussion upstream about how Xen
could provide some anti mac/ip spoofing in the dom0. It does not seem to
be a trivial thing to do, since it requires having quite some knowledge
about what the domU is allowed to do or not (e.g. a domU can be a
router...).
Hans van Kranenburg [Thu, 3 Jan 2019 21:03:06 +0000 (22:03 +0100)]
Fix empty fields in first hypervisor log line
Instead of:
(XEN) Xen version 4.11.1 (Debian )
(@)
(gcc (Debian 8.2.0-13) 8.2.0) debug=n
Thu Jan 3 19:08:37 UTC 2019
I'd like to see:
(XEN) Xen version 4.11.1 (Debian 4.11.1-1~)
(pkg-xen-devel@lists.alioth.debian.org)
(gcc (Debian 8.2.0-13) 8.2.0) debug=n
Thu Jan 3 22:44:00 CET 2019
The substitution was broken since the great packaging refactoring,
because the directory in which the build is done changed.
Also, use the Maintainer address from debian/control instead of the most
recent changelog entry. If someone wants to use the address to ask a
question, they will end up at the team mailing list, which is better
than an individual person.
Ian Jackson [Mon, 15 Oct 2018 11:11:32 +0000 (12:11 +0100)]
Revert "tools-xenstore-compatibility.diff"
Following recent discussion in pkg-xen-devel and xen-devel,
https://lists.xenproject.org/archives/html/xen-devel/2018-10/msg00838.html
I am dropping this patch.
For now I revert it. When we next debrebase, we can (if we like)
throw away both the original patch, and this revert.
This reverts commit
5047884c76849b67e364bc525d1b3b55e781cf16.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 12 Oct 2018 16:56:56 +0000 (17:56 +0100)]
docs/man/xen-vbd-interface.7: Provide properly-formatted NAME section
This manpage was omitted from
docs/man: Provide properly-formatted NAME sections
because I was previously building with markdown not installed.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 12 Oct 2018 17:56:04 +0000 (17:56 +0000)]
tools/firmware/Makefile: CONFIG_PV_SHIM: enable only on x86_64
Previously this was *dis*abled for x86_*32*. But if someone should
run some of this Makefile on ARM, say, it ought not to be built
either.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Ian Jackson [Fri, 12 Oct 2018 17:17:10 +0000 (17:17 +0000)]
shim: Provide separate install-shim target
When building on a 32-bit userland, the user wants to build 32-bit
tools and a 64-bit hypervisor. This involves setting XEN_TARGET_ARCH
to different values for the tools build and the hypervisor build.
So the user must invoke the tools build and the hypervisor build
separately.
However, although the shim is done by the tools/firmware Makefile, its
bitness needs to be the same as the hypervisor, not the same as the
tools. When run with XEN_TARGET_ARCH=x86_32, it it skipped, which is
wrong.
So the user must invoke the shim build separately. This can be done
with
make -C tools/firmware/xen-dir XEN_TARGET_ARCH=x86_64
However, tools/firmware/xen-dir has no `install' target. The
installation of all `firmware' is done in tools/firmware/Makefile. It
might be possible to fix this, but it is not trivial. For example,
the definitions of INST_DIR and DEBG_DIR would need to be copied, as
would an appropriate $(INSTALL_DIR) call.
For now, provide an `install-shim' target in tools/firmware/Makefile.
This has to be called from `install' of course. We can't make it
a dependency of `install' because it might be run before `all' has
completed. We could make it depend on a `shim' target but such
a target is nearly impossible to write because everything is done by
the inflexible subdir-$@ machinery.
The overally result of this patch is that existing make invocations
work as before. But additionally, the user can say
make -C tools/firmware install-shim XEN_TARGET_ARCH=x86_64
to install the shim. The user must have built it already.
Unlike the build rune, this install-rune is properly conditional
so it is OK to call on ARM.
What a mess.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Ian Jackson [Fri, 12 Oct 2018 16:00:16 +0000 (16:00 +0000)]
tools/firmware/Makfile: Respect caller's CONFIG_PV_SHIM
This makes it easier to disable the shim build. (In Debian we need to
build the shim separately because it needs different compiler flags
and a different XEN_COMPILE_ARCH.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Ian Jackson [Fri, 5 Oct 2018 17:05:48 +0000 (18:05 +0100)]
.gitignore: Add configure output which we always delete and regenerate
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 15:25:58 +0000 (16:25 +0100)]
autoconf: Provide libexec_libdir_suffix
This is going to be used to put libfsimage.so into a path containing
the multiarch triplet.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 5 Oct 2018 16:53:38 +0000 (17:53 +0100)]
tools-libfsimage-prefix.diff
Patch-Name: tools-libfsimage-prefix.diff
Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name tools-libfsimage-prefix.diff
Bastian Blank [Sat, 5 Jul 2014 09:46:47 +0000 (11:46 +0200)]
tools-libfsimage-abiname.diff
Patch-Name: tools-libfsimage-abiname.diff
Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name tools-libfsimage-abiname.diff
Ian Jackson [Thu, 20 Sep 2018 17:10:14 +0000 (18:10 +0100)]
Do not build the instruction emulator
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Tue, 1 Nov 2016 16:20:27 +0000 (16:20 +0000)]
tools/tests/x86_emulator: Pass -no-pie -fno-pic to gcc on x86_32
The current build fails with GCC6 on Debian sid i386 (unstable):
/tmp/ccqjaueF.s: Assembler messages:
/tmp/ccqjaueF.s:3713: Error: missing or invalid displacement expression `vmovd_to_reg_len@GOT'
This is due to the combination of GCC6, and Debian's decision to
enable some hardening flags by default (to try to make runtime
addresses less predictable):
https://wiki.debian.org/Hardening/PIEByDefaultTransition
This is of no benefit for the x86 instruction emulator test, which is
a rebuild of the emulator code for testing purposes only. So pass
options to disable this.
These options will be no-ops if they are the same as the compiler
default.
On amd64, the -fno-pic breaks the build in a different way. So do
this only on i386.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name toolstestsx86_emulator-pass--no-pie--fno.patch
Bastian Blank [Sat, 5 Jul 2014 09:47:29 +0000 (11:47 +0200)]
Remove static solaris support from pygrub
Patch-Name: tools-pygrub-remove-static-solaris-support
Gbp-Pq: Topic misc
Gbp-Pq: Name tools-pygrub-remove-static-solaris-support
Bastian Blank [Sat, 5 Jul 2014 09:47:31 +0000 (11:47 +0200)]
tools-xenmon-install.diff
Patch-Name: tools-xenmon-install.diff
Gbp-Pq: Topic misc
Gbp-Pq: Name tools-xenmon-install.diff
Bastian Blank [Sat, 5 Jul 2014 09:47:30 +0000 (11:47 +0200)]
Do not ship COPYING into /usr/include
This is not wanted in Debian. COPYING ends up in
/usr/share/doc/xen-*copyright.
Patch-Name: tools-include-no-COPYING.diff
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Bastian Blank [Sat, 5 Jul 2014 09:46:45 +0000 (11:46 +0200)]
config-prefix.diff
Patch-Name: config-prefix.diff
Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name config-prefix.diff
Bastian Blank [Sat, 5 Jul 2014 09:46:43 +0000 (11:46 +0200)]
version
Patch-Name: version.diff
Gbp-Pq: Topic misc
Gbp-Pq: Name version.diff
Marek Marczykowski-Górecki [Thu, 5 Apr 2018 01:50:55 +0000 (03:50 +0200)]
tools/kdd: mute spurious gcc warning
gcc-8 complains:
kdd.c:698:13: error: 'memcpy' offset [-204, -717] is out of the bounds [0, 216] of object 'ctrl' with type 'kdd_ctrl' {aka 'union <anonymous>'} [-Werror=array-bounds]
memcpy(buf, ((uint8_t *)&ctrl.c32) + offset, len);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kdd.c: In function 'kdd_select_callback':
kdd.c:642:14: note: 'ctrl' declared here
kdd_ctrl ctrl;
^~~~
But this is impossible - 'offset' is unsigned and correctly validated
few lines before.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Release-Acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit
437e00fea04becc91c1b6bc1c0baa636b067a5cc)
Christopher Clark [Thu, 16 Aug 2018 20:22:41 +0000 (13:22 -0700)]
libxl/arm: Fix build on arm64 + acpi w/ gcc 8.2
Add zero-padding to #defined ACPI table strings that are copied.
Provides sufficient characters to satisfy the length required to
fully populate the destination and prevent array-bounds warnings.
Add BUILD_BUG_ON sizeof checks for compile-time length checking.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit
b8f33431f3dd23fb43a879f4bdb4283fdc9465ad)
Andrew Cooper [Wed, 4 Jul 2018 13:32:31 +0000 (14:32 +0100)]
tools: Move ARRAY_SIZE() into xen-tools/libs.h
xen-tools/libs.h currently contains a shared BUILD_BUG_ON() implementation and
is used by some tools. Extend this to include ARRAY_SIZE and clean up all the
opencoding.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit
e1b7eb92d3ec6ce3ca68cffb36a148eb59f59613)
Wei Liu [Thu, 26 Jul 2018 14:58:54 +0000 (15:58 +0100)]
xenpmd: make 32 bit gcc 8.1 non-debug build work
32 bit gcc 8.1 non-debug build yields:
xenpmd.c:354:23: error: '%02x' directive output may be truncated writing between 2 and 8 bytes into a region of size 3 [-Werror=format-truncation=]
snprintf(val, 3, "%02x",
^~~~
xenpmd.c:354:22: note: directive argument in the range [40,
2147483778]
snprintf(val, 3, "%02x",
^~~~~~
xenpmd.c:354:5: note: 'snprintf' output between 3 and 9 bytes into a destination of size 3
snprintf(val, 3, "%02x",
^~~~~~~~~~~~~~~~~~~~~~~~
(unsigned int)(9*4 +
~~~~~~~~~~~~~~~~~~~~
strlen(info->model_number) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
strlen(info->serial_number) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
strlen(info->battery_type) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
strlen(info->oem_info) + 4));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All info->* used in calculation are 32 bytes long, and the parsing
code makes sure they are null-terminated, so the end result of the
expression won't exceed 255, which should be able to be fit into 3
bytes in hexadecimal format.
Add an assertion to make gcc happy.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit
e75c9dc85fdeeeda0b98d8cd8d784e0508c3ffb8)
Ian Jackson [Wed, 19 Sep 2018 15:53:22 +0000 (16:53 +0100)]
Delete configure output
These autogenerated files are not useful in Debian; dh_autoreconf will
regenerate them.
If this patch does not apply when rebasing, you can simply delete the
files again.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 19 Sep 2018 15:45:49 +0000 (16:45 +0100)]
Delete config.sub and config.guess
dh_autoreconf will provide these back.
If this patch does not apply when rebasing, you can simply delete the
files again.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Bastian Blank [Sat, 5 Jul 2014 09:47:36 +0000 (11:47 +0200)]
tools-xenstore-compatibility.diff
Patch-Name: tools-xenstore-compatibility.diff
Gbp-Pq: Topic xenstore
Gbp-Pq: Name tools-xenstore-compatibility.diff
Debian Xen Team [Fri, 24 Aug 2018 17:45:17 +0000 (18:45 +0100)]
tools-fake-xs-restrict
Gbp-Pq: Topic xenstore
Gbp-Pq: Name tools-fake-xs-restrict.patch
Ian Jackson [Fri, 28 Sep 2018 14:30:54 +0000 (15:30 +0100)]
tools/debugger/kdd: Install as `xen-kdd', not just `kdd'
`kdd' is an unfortunate namespace landgrab.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 28 Sep 2018 14:27:21 +0000 (15:27 +0100)]
xenmon: Install as xenmon, not xenmon.py
Adding the implementation language as a suffix to a program name is
poor practice.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Thu, 4 Oct 2018 11:32:00 +0000 (12:32 +0100)]
pygrub fsimage.so: Honour LDFLAGS when building
This seems to have been simply omitted. Obviously this is needed when
building and not just when installing. Passing only when installing
is ineffective.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Thu, 4 Oct 2018 11:31:25 +0000 (12:31 +0100)]
libfsimage: Honour general LDFLAGS
Do not reset LDFLAGS to empty. Instead, append the fsimage-special
LDFLAGS.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Thu, 4 Oct 2018 11:30:37 +0000 (12:30 +0100)]
gdbsx: Honour LDFLAGS when linking
This command does the link, so it needs LDFLAGS.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Bastian Blank [Sat, 5 Jul 2014 09:46:50 +0000 (11:46 +0200)]
tools/xenstat: Fix shared library version
libxenstat does not have a stable ABI. Set its version to the current
Xen release version.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:43:55 +0000 (18:43 +0100)]
docs/man/xen-pv-channel.pod.7: Remove a spurious blank line
No functional change.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:42:42 +0000 (18:42 +0100)]
docs/man: Provide properly-formatted NAME sections
A manpage `foo.7.pod' must start with
=head NAME
foo - some summary of what foo is or what this manpage is
because otherwise manpage catalogue systems cannot generate a proper
`whatis' entry.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 21 Sep 2018 14:40:19 +0000 (15:40 +0100)]
INSTALL: Mention kconfig
Firstly, add a reference to the documentation for the kconfig system.
Secondly, warn the user about the XEN_CONFIG_EXPERT problem.
CC: Doug Goldstein <cardoe@cardoe.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 5 Oct 2018 16:52:54 +0000 (17:52 +0100)]
tools/Rules.mk: Honour PREPEND_LDFLAGS_XEN_TOOLS
This allows the caller to provide some LDFLAGS to the Xen build
system.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Christopher Clark [Wed, 18 Jul 2018 22:22:17 +0000 (15:22 -0700)]
tools/xentop : replace use of deprecated vwprintw
gcc-8.1 complains:
| xentop.c: In function 'print':
| xentop.c:304:4: error: 'vwprintw' is deprecated [-Werror=deprecated-declarations]
| vwprintw(stdscr, (curses_str_t)fmt, args);
| ^~~~~~~~
vw_printw (note the underscore) is a non-deprecated alternative.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name tools-xentop-replace-use-of-deprecated-vwprintw.patch
Ian Jackson [Wed, 3 Oct 2018 18:00:22 +0000 (19:00 +0100)]
Various: Fix typo `mappping'
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:59:18 +0000 (18:59 +0100)]
Various: Fix typo `infomation'
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:57:13 +0000 (18:57 +0100)]
tools/python/xen/lowlevel: Fix typo `sucess'
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:56:39 +0000 (18:56 +0100)]
Various: Fix typo `reseting'
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:55:36 +0000 (18:55 +0100)]
Various: Fix typo `occured'
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:51:50 +0000 (18:51 +0100)]
Various: Fix typos `unkown', `retreive' (detected by lintian)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:46:47 +0000 (18:46 +0100)]
tools/xentrace/xenalyze: Fix typos detected by lintian
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 17:44:18 +0000 (18:44 +0100)]
docs/man: Fix two typos detected by the Debian lintian tool
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Mon, 13 May 2019 19:54:56 +0000 (21:54 +0200)]
Update changelog for new upstream 4.11.1+
58-g3b062f5040
[git-debrebase changelog: new upstream 4.11.1+
58-g3b062f5040]
Hans van Kranenburg [Mon, 13 May 2019 19:54:56 +0000 (21:54 +0200)]
Update to upstream 4.11.1+
58-g3b062f5040
[git-debrebase anchor: new upstream 4.11.1+
58-g3b062f5040, merge]
Ian Jackson [Thu, 28 Feb 2019 16:38:37 +0000 (16:38 +0000)]
finalise 4.11.1+
26-g87f51bf366-3
Andrew Cooper [Fri, 3 May 2019 08:55:55 +0000 (10:55 +0200)]
x86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts
All currently-released Atom processors are in practice retpoline-safe, because
they don't fall back to a BTB prediction on RSB underflow.
However, an additional meaning of Enhanced IRBS is that the processor may not
be retpoline-safe. The Gemini Lake platform, based on the Goldmont Plus
microarchitecture is the first Atom processor to support eIBRS.
Until Xen gets full eIBRS support, Gemini Lake will still be safe using
regular IBRS.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit:
17f74242ccf0ce6e51c03a5860947865c0ef0dc2
master date: 2019-03-18 16:26:40 +0000
Andrew Cooper [Fri, 3 May 2019 08:55:10 +0000 (10:55 +0200)]
x86/msr: Shorten ARCH_CAPABILITIES_* constants
They are unnecesserily verbose, and ARCH_CAPS_* is already the more common
version.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit:
ba27aaa88548c824a47dcf5609288ee1c05d2946
master date: 2019-03-18 16:26:40 +0000
Jan Beulich [Fri, 3 May 2019 08:53:40 +0000 (10:53 +0200)]
x86/e820: fix build with gcc9
e820.c: In function ‘clip_to_limit’:
.../xen/include/asm/string.h:10:26: error: ‘__builtin_memmove’ offset [-16, -36] is out of the bounds [0, 20484] of object ‘e820’ with type ‘struct e820map’ [-Werror=array-bounds]
10 | #define memmove(d, s, n) __builtin_memmove(d, s, n)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
e820.c:404:13: note: in expansion of macro ‘memmove’
404 | memmove(&e820.map[i], &e820.map[i+1],
| ^~~~~~~
e820.c:36:16: note: ‘e820’ declared here
36 | struct e820map e820;
| ^~~~
While I can't see where the negative offsets would come from, converting
the loop index to unsigned type helps. Take the opportunity and also
convert several other local variables and copy_e820_map()'s second
parameter to unsigned int (and bool in one case).
Reported-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
22e2f8dddf5fbed885b5e4db3ffc9e1101be9ec0
master date: 2019-03-18 11:38:36 +0100
Andrew Cooper [Fri, 3 May 2019 08:52:32 +0000 (10:52 +0200)]
xen: Fix backport of "xen/cmdline: Fix buggy strncmp(s, LITERAL, ss - s) construct"
These were missed as a consequence of being rebased over other cmdline
cleanup.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 3 May 2019 08:51:31 +0000 (10:51 +0200)]
xen: Fix backport of "x86/tsx: Implement controls for RTM force-abort mode"
The posted version of this patch depends on c/s
3c555295 "x86/vpmu: Improve
documentation and parsing for vpmu=" (Xen 4.12 and later) to prevent
`vpmu=rtm-abort` impliying `vpmu=1`, which is outside of security support.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Wed, 28 Nov 2018 17:43:33 +0000 (17:43 +0000)]
tools/firmware: update OVMF Makefile, when necessary
[ This is two commits from master aka staging-4.12: ]
OVMF has become dependent on OpenSSL, which is included as a
submodule. Initialise submodules before building.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit
b16281870e06f5f526029a4e69634a16dc38e8e4)
tools: only call git when necessary in OVMF Makefile
Users may choose to export a snapshot of OVMF and build it
with xen.git supplied ovmf-makefile. In that case we don't
need to call `git submodule`.
Fixes
b16281870e.
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit
68292c94a60eab24514ab4a8e4772af24dead807)
Jan Beulich [Tue, 12 Mar 2019 13:42:17 +0000 (14:42 +0100)]
Arm/atomic: correct asm() constraints in build_add_sized()
The memory operand is an in/out one, and the auxiliary register gets
written to early.
Take the opportunity and also drop the redundant cast (the inline
functions' parameters are already of the casted-to type).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
(cherry picked from commit
51ceb1623b9956440f1b9943c67010a90d61f5c5)
Andrew Cooper [Mon, 18 Mar 2019 16:09:08 +0000 (17:09 +0100)]
x86/pv: Fix construction of 32bit dom0's
dom0_construct_pv() has logic to transition dom0 into a compat domain when
booting an ELF32 image.
One aspect which is missing is the CPUID policy recalculation, meaning that a
32bit dom0 sees a 64bit policy, which differ by the Long Mode feature flag in
particular. Another missing item is the x87_fip_width initialisation.
Update dom0_construct_pv() to use switch_compat(), rather than retaining the
opencoding. Position the call to switch_compat() such that the compat32 local
variable can disappear entirely.
The 32bit monitor table is now created by setup_compat_l4(), avoiding the need
to for manual creation later.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
356f437171c5bb90701ac9dd7ba4dbbd05988e38
master date: 2019-03-15 14:59:27 +0000
Andrew Cooper [Mon, 18 Mar 2019 16:08:25 +0000 (17:08 +0100)]
x86/tsx: Implement controls for RTM force-abort mode
The CPUID bit and MSR are deliberately not exposed to guests, because they
won't exist on newer processors. As vPMU isn't security supported, the
misbehaviour of PCR3 isn't expected to impact production deployments.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
6be613f29b4205349275d24367bd4c82fb2960dd
master date: 2019-03-12 17:05:21 +0000
Andrew Cooper [Mon, 18 Mar 2019 16:07:45 +0000 (17:07 +0100)]
x86/vtd: Don't include control register state in the table pointers
iremap_maddr and qinval_maddr point to the base of a block of contiguous RAM,
allocated by the driver, holding the Interrupt Remapping table, and the Queued
Invalidation ring.
Despite their name, they are actually the values of the hardware register,
including control metadata in the lower 12 bits. While uses of these fields
do appear to correctly shift out the metadata, this is very subtle behaviour
and confusing to follow.
Nothing uses the metadata, so make the fields actually point at the base of
the relevant tables.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit:
a9a05aeee10a5a3763a41305a9f38112dd1fcc82
master date: 2019-03-12 13:57:13 +0000
Jan Beulich [Mon, 18 Mar 2019 16:07:11 +0000 (17:07 +0100)]
x86/HVM: don't crash guest in hvmemul_find_mmio_cache()
Commit
35a61c05ea ("x86emul: adjust handling of AVX2 gathers") builds
upon the fact that the domain will actually survive running out of MMIO
result buffer space. Drop the domain_crash() invocation. Also delay
incrementing of the usage counter, such that the function can't possibly
use/return an out-of-bounds slot/pointer in case execution subsequently
makes it into the function again without a prior reset of state.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
master commit:
a43c1dec246bdee484e6a3de001cc6850a107abe
master date: 2019-03-12 14:39:46 +0100
Igor Druzhinin [Mon, 18 Mar 2019 16:06:37 +0000 (17:06 +0100)]
iommu: leave IOMMU enabled by default during kexec crash transition
It's unsafe to disable IOMMU on a live system which is the case
if we're crashing since remapping hardware doesn't usually know what
to do with ongoing bus transactions and frequently raises NMI/MCE/SMI,
etc. (depends on the firmware configuration) to signal these abnormalities.
This, in turn, doesn't play well with kexec transition process as there is
no handling available at the moment for this kind of events resulting
in failures to enter the kernel.
Modern Linux kernels taught to copy all the necessary DMAR/IR tables
following kexec from the previous kernel (Xen in our case) - so it's
currently normal to keep IOMMU enabled. It might require minor changes to
kdump command line that enables IOMMU drivers (e.g. intel_iommu=on /
intremap=on) but recent kernels don't require any additional changes for
the transition to be transparent.
A fallback option is still left for compatibility with ancient crash
kernels which didn't like to have IOMMU active under their feet on boot.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit:
12c36f577d454996c882ecdc5da8113ca2613646
master date: 2019-03-12 14:38:12 +0100
Jan Beulich [Mon, 18 Mar 2019 16:05:45 +0000 (17:05 +0100)]
x86/cpuid: add missing PCLMULQDQ dependency
Since we can't seem to be able to settle our discussion for the wider
adjustment previously posted, let's at least add the missing dependency
for 4.12. I'm not convinced though that attaching it to SSE is correct.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
eeb31ee522c7bb8541eb4c037be2c42bfcf0a3c3
master date: 2019-03-05 18:04:23 +0100
Jan Beulich [Mon, 18 Mar 2019 16:05:07 +0000 (17:05 +0100)]
x86/mm: fix #GP(0) in switch_cr3_cr4()
With "pcid=no-xpti" and opposite XPTI settings in two 64-bit PV domains
(achievable with one of "xpti=no-dom0" or "xpti=no-domu"), switching
from a PCID-disabled to a PCID-enabled 64-bit PV domain fails to set
CR4.PCIDE in time, as CR4.PGE would not be set in either (see
pv_fixup_guest_cr4(), in particular as used by write_ptbase()), and
hence the early CR4 write would be skipped.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
fdc2056767ba74346dfd8bbe868bb22521ba1418
master date: 2019-03-05 17:02:36 +0100
Igor Druzhinin [Mon, 18 Mar 2019 16:04:30 +0000 (17:04 +0100)]
x86/nmi: correctly check MSB of P6 performance counter MSR in watchdog
The logic currently tries to work out if a recent overflow (that indicates
that NMI comes from the watchdog) happened by checking MSB of performance
counter MSR that is initially sign extended from a negative value
that we program it to. A possibly incorrect assumption here is that
MSB is always bit 32 while on modern hardware it's usually 47 and
the actual bit-width is reported through CPUID. Checking bit 32 for
overflows is usually fine since we never program it to anything
exceeding 32-bits and NMI is handled shortly after overflow occurs.
A problematic scenario that we saw occurs on systems where SMIs taking
significant time are possible. In that case, NMI handling is deferred to
the point firmware exits SMI which might take enough time for the counter
to go through bit 32 and set it to 1 again. So the logic described above
will misread it and report an unknown NMI erroneously.
Fortunately, we can use the actual MSB, which is usually higher than the
currently hardcoded 32, and treat this case correctly at least on modern
hardware.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
0452d02b6e7849537914dd30cbfc8eb27cdad2ce
master date: 2019-02-28 13:44:40 +0000
Andrew Cooper [Mon, 18 Mar 2019 16:03:41 +0000 (17:03 +0100)]
x86/hvm: Increase the triple fault log message level to XENLOG_ERR
At INFO level, it doesn't get printed out by default in release builds,
leading to unqualified logging such as this:
(XEN) [ 66.995993] Freed 524kB init memory
(XEN) [ 1993.144997] *** Dumping Dom9 vcpu#2 state: ***
(XEN) [ 1993.145008] ----[ Xen-4.11.1 x86_64 debug=n Not tainted ]----
(XEN) [ 1993.145011] CPU: 21
(XEN) [ 1993.145015] RIP: 0010:[<
ffffe0002ba950ef>]
(XEN) [ 1993.145018] RFLAGS:
0000000000010246 CONTEXT: hvm guest (d9v2)
(XEN) [ 1993.145026] rax:
00000000ffffe000 rbx:
ffffe0002d8e1440 rcx:
0000ffffe0002ba9
(XEN) [ 1993.145031] rdx:
0000000000000000 rsi:
ffffe0002ba93575 rdi:
fffff803dfb9f340
(XEN) [ 1993.145035] rbp:
ffffd001cd791200 rsp:
ffffd001cd791140 r8:
0000000000000130
(XEN) [ 1993.145039] r9:
0000000080000000 r10:
0000000000000000 r11:
0000000000000020
(XEN) [ 1993.145043] r12:
ffffe0002ba9306d r13:
0000000000000000 r14:
0000000000000001
(XEN) [ 1993.145047] r15:
fffff803dfb9f200 cr0:
0000000080050031 cr4:
0000000000170678
(XEN) [ 1993.145051] cr3:
00000000001aa002 cr2:
0000020488403f70
(XEN) [ 1993.145056] fsb:
0000000060f71000 gsb:
ffffd001cc1af000 gss:
0000009d60f6f000
(XEN) [ 1993.145060] ds: 002b es: 002b fs: 0053 gs: 002b ss: 0018 cs: 0010
A triple fault is fatal to the domain under all circumstances (so will print
at most once), and in practice is always an error condition rather than a
reboot fallback.
Reported-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit:
1c8ca185e3c6e003398471edd9dbac0cd118137c
master date: 2019-02-28 11:16:27 +0000
Andrew Cooper [Mon, 18 Mar 2019 16:02:49 +0000 (17:02 +0100)]
x86/vmx: Properly flush the TLB when an altp2m is modified
Modifications to an altp2m mark the p2m as needing flushing, but this was
never wired up in the return-to-guest path. As a result, stale TLB entries
can remain after resuming the guest.
In practice, this manifests as a missing EPT_VIOLATION or #VE exception when
the guest subsequently accesses a page which has had its permissions reduced.
vmx_vmenter_helper() now has 11 p2ms to potentially invalidate, but issuing 11
INVEPT instructions isn't clever. Instead, count how many contexts need
invalidating, and use INVEPT_ALL_CONTEXT if two or more are in need of
flushing.
This doesn't have an XSA because altp2m is not yet a security-supported
feature.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit:
69f7643df68ef8e994221a996e336a47cbb7bbc8
master date: 2019-02-28 11:16:27 +0000
Jan Beulich [Mon, 18 Mar 2019 16:02:02 +0000 (17:02 +0100)]
x86/shadow: don't use map_domain_page_global() on paths that may not fail
The assumption (according to one comment) and hope (according to
another) that map_domain_page_global() can't fail are both wrong on
large enough systems. Do away with the guest_vtable field altogether,
and establish / tear down the desired mapping as necessary.
The alternatives, discarded as being undesirable, would have been to
either crash the guest in sh_update_cr3() when the mapping fails, or to
bubble up an error indicator, which upper layers would have a hard time
to deal with (other than again by crashing the guest).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
master commit:
43282a5e64da26fad544e0100abf35048cf65b46
master date: 2019-02-26 16:56:26 +0100
Paul Durrant [Mon, 18 Mar 2019 16:01:21 +0000 (17:01 +0100)]
viridian: fix the HvFlushVirtualAddress/List hypercall implementation
The current code uses hvm_asid_flush_vcpu() but this is insufficient for
a guest running in shadow mode, which results in guest crashes early in
boot if the 'hcall_remote_tlb_flush' is enabled.
This patch, instead of open coding a new flush algorithm, adapts the one
already used by the HVMOP_flush_tlbs Xen hypercall. The implementation is
modified to allow TLB flushing a subset of a domain's vCPUs. A callback
function determines whether or not a vCPU requires flushing. This mechanism
was chosen because, while it is the case that the currently implemented
viridian hypercalls specify a vCPU mask, there are newer variants which
specify a sparse HV_VP_SET and thus use of a callback will avoid needing to
expose details of this outside of the viridian subsystem if and when those
newer variants are implemented.
NOTE: Use of the common flush function requires that the hypercalls are
restartable and so, with this patch applied, viridian_hypercall()
can now return HVM_HCALL_preempted. This is safe as no modification
to struct cpu_user_regs is done before the return.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
ce98ee3050a824994ce4957faa8f53ecb8c7da9d
master date: 2019-02-26 16:55:06 +0100
Jan Beulich [Mon, 18 Mar 2019 16:00:28 +0000 (17:00 +0100)]
x86/shadow: don't pass wrong L4 MFN to guest_walk_tables()
64-bit PV guest user mode runs on a different L4 table. Make sure
- the accessed bit gets set in the correct table (and in log-dirty
mode the correct page gets marked dirty) during guest walks,
- the correct table gets audited by sh_audit_gw(),
- correct info gets logged by print_gw().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit:
db2af23d15077605f286d8ef86c8f5d9c1b8302a
master date: 2019-02-20 17:07:17 +0100
Varad Gautam [Mon, 18 Mar 2019 15:59:43 +0000 (16:59 +0100)]
x86/pmtimer: fix hvm_acpi_sleep_button behavior
Commit
19fb14622e941 "x86/pmtimer: move ACPI registers from PMTState to
hvm_domain" misconfigures pm1a_sts for hvm_acpi_sleep_button with
PWRBTN_STS instead of SLPBTN_STS, which leads to
XEN_DOMCTL_SENDTRIGGER_SLEEP causing guest powerdowns. Fix this.
Signed-off-by: Varad Gautam <vrd@amazon.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
b22c900c44a2db8db1c53e269e152206e55c273f
master date: 2019-02-20 17:06:25 +0100
Jan Beulich [Tue, 5 Mar 2019 14:06:47 +0000 (15:06 +0100)]
x86/pv: _toggle_guest_pt() may not skip TLB flush for shadow mode guests
For shadow mode guests (e.g. PV ones forced into that mode as L1TF
mitigation, or during migration) update_cr3() -> sh_update_cr3() may
result in a change to the (shadow) root page table (compared to the
previous one when running the same vCPU with the same PCID). This can,
first and foremost, be a result of memory pressure on the shadow memory
pool of the domain. Shadow code legitimately relies on the original
(prior to commit
5c81d260c2 ["xen/x86: use PCID feature"]) behavior of
the subsequent CR3 write to flush the TLB of entries still left from
walks with an earlier, different (shadow) root page table.
Restore the flushing behavior, also for the second CR3 write on the exit
path to guest context when XPTI is active. For the moment accept that
this will introduce more flushes than are strictly necessary - no flush
would be needed when the (shadow) root page table doesn't actually
change, but this information isn't readily (i.e. without introducing a
layering violation) available here.
This is XSA-294.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
329b00e4d49f70185561d7cc4b076c77869888a0
master date: 2019-03-05 13:54:42 +0100
Andrew Cooper [Tue, 5 Mar 2019 14:05:50 +0000 (15:05 +0100)]
x86/pv: Don't have %cr4.fsgsbase active behind a guest kernels back
Currently, a 64bit PV guest can appear to set and clear FSGSBASE in %cr4, but
the bit remains set in hardware. Therefore, the {RD,WR}{FS,GS}BASE are usable
even when the guest kernel believes that they are disabled.
The FSGSBASE feature isn't currently supported in Linux, and its context
switch path has some optimisations which rely on userspace being unable to use
the WR{FS,GS}BASE instructions. Xen's current behaviour undermines this
expectation.
In 64bit PV guest context, always load the guest kernels setting of FSGSBASE
into %cr4. This requires adjusting how Xen uses the {RD,WR}{FS,GS}BASE
instructions.
* Delete the cpu_has_fsgsbase helper. It is no longer safe, as users need to
check %cr4 directly.
* The raw __rd{fs,gs}base() helpers are only safe to use when %cr4.fsgsbase
is set. Comment this property.
* The {rd,wr}{fs,gs}{base,shadow}() and read_msr() helpers are updated to use
the current %cr4 value to determine which mechanism to use.
* toggle_guest_mode() and save_segments() are update to avoid reading
fs/gsbase if the values in hardware cannot be stale WRT struct vcpu. A
consequence of this is that the write_cr() path needs to cache the current
bases, as subsequent context switches will skip saving the values.
* write_cr4() is updated to ensure that the shadow %cr4.fsgsbase value is
observed in a safe way WRT the hardware setting, if an interrupt happens to
hit in the middle.
* load_segments() is updated to use the VMLOAD optimisation if FSGSBASE is
unavailable, even if only gs_shadow needs updating. As a minor perf
improvement, check cpu_has_svm first to short circuit a context-dependent
conditional on Intel hardware.
* pv_make_cr4() is updated for 64bit PV guests to use the guest kernels
choice of FSGSBASE.
This is part of XSA-293.
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
eccc170053e46b4ab1d9e7485c09e210be15bbd7
master date: 2019-03-05 13:54:05 +0100
Andrew Cooper [Tue, 5 Mar 2019 14:04:54 +0000 (15:04 +0100)]
x86/pv: Rewrite guest %cr4 handling from scratch
The PV cr4 logic is almost impossible to follow, and leaks bits into guest
context which definitely shouldn't be visible (in particular, VMXE).
The biggest problem however, and source of the complexity, is that it derives
new real and guest cr4 values from the current value in hardware - this is
context dependent and an inappropriate source of information.
Rewrite the cr4 logic to be invariant of the current value in hardware.
First of all, modify write_ptbase() to always use mmu_cr4_features for IDLE
and HVM contexts. mmu_cr4_features *is* the correct value to use, and makes
the ASSERT() obviously redundant.
For PV guests, curr->arch.pv.ctrlreg[4] remains the guests view of cr4, but
all logic gets reworked in terms of this and mmu_cr4_features only.
Two masks are introduced; bits which the guest has control over, and bits
which are forwarded from Xen's settings. One guest-visible change here is
that Xen's VMXE setting is no longer visible at all.
pv_make_cr4() follows fairly closely from pv_guest_cr4_to_real_cr4(), but
deliberately starts with mmu_cr4_features, and only alters the minimal subset
of bits.
The boot-time {compat_,}pv_cr4_mask variables are removed, as they are a
remnant of the pre-CPUID policy days. pv_fixup_guest_cr4() gains a related
derivation from the policy.
Another guest visible change here is that a 32bit PV guest can now flip
FSGSBASE in its view of CR4. While the {RD,WR}{FS,GS}BASE instructions are
unusable outside of a 64bit code segment, the ability to modify FSGSBASE
matches real hardware behaviour, and avoids the need for any 32bit/64bit
differences in the logic.
Overall, this patch shouldn't have a practical change in guest behaviour.
VMXE will disappear from view, and an inquisitive 32bit kernel can now see
FSGSBASE changing, but this new logic is otherwise bug-compatible with before.
This is part of XSA-293.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
b2dd00574a4fc87ca964177f8e752a968c27efb2
master date: 2019-03-05 13:53:32 +0100
Jan Beulich [Tue, 5 Mar 2019 14:04:21 +0000 (15:04 +0100)]
x86/mm: properly flush TLB in switch_cr3_cr4()
The CR3 values used for contexts run with PCID enabled uniformly have
CR3.NOFLUSH set, resulting in the CR3 write itself to not cause any
flushing at all. When the second CR4 write is skipped or doesn't do any
flushing, there's nothing so far which would purge TLB entries which may
have accumulated again if the PCID doesn't change; the "just in case"
flush only affects the case where the PCID actually changes. (There may
be particularly many TLB entries re-accumulated in case of a watchdog
NMI kicking in during the critical time window.)
Suppress the no-flush behavior of the CR3 write in this particular case.
Similarly the second CR4 write may not cause any flushing of TLB entries
established again while the original PCID was still in use - it may get
performed because of unrelated bits changing. The flush of the old PCID
needs to happen nevertheless.
At the same time also eliminate a possible race with lazy context
switch: Just like for CR4, CR3 may change at any time while interrupts
are enabled, due to the __sync_local_execstate() invocation from the
flush IPI handler. It is for that reason that the CR3 read, just like
the CR4 one, must happen only after interrupts have been turned off.
This is XSA-292.
Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
master commit:
6e5f22ba437d78c0a84b9673f7e2cfefdbc62f4b
master date: 2019-03-05 13:52:44 +0100
Jan Beulich [Tue, 5 Mar 2019 14:03:43 +0000 (15:03 +0100)]
x86/mm: don't retain page type reference when IOMMU operation fails
The IOMMU update in _get_page_type() happens between recording of the
new reference and validation of the page for its new type (if
necessary). If the IOMMU operation fails, there's no point in actually
carrying out validation. Furthermore, with this resulting in failure
getting indicated to the caller, the recorded type reference also needs
to be dropped again.
Note that in case of failure of alloc_page_type() there's no need to
undo the IOMMU operation: Only special types get handed to the function.
The function, upon failure, clears ->u.inuse.type_info, effectively
converting the page to PGT_none. The IOMMU mapping, however, solely
depends on whether the type is PGT_writable_page.
This is XSA-291.
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
fad0de986220c46e70be2f83279961aad7394af0
master date: 2019-03-05 13:52:15 +0100
Jan Beulich [Tue, 5 Mar 2019 14:02:19 +0000 (15:02 +0100)]
x86/mm: add explicit preemption checks to L3 (un)validation
When recursive page tables are used at the L3 level, unvalidation of a
single L4 table may incur unvalidation of two levels of L3 tables, i.e.
a maximum iteration count of 512^3 for unvalidating an L4 table. The
preemption check in free_l2_table() as well as the one in
_put_page_type() may never be reached, so explicit checking is needed in
free_l3_table().
When recursive page tables are used at the L4 level, the iteration count
at L4 alone is capped at 512^2. As soon as a present L3 entry is hit
which itself needs unvalidation (and hence requiring another nested loop
with 512 iterations), the preemption checks added here kick in, so no
further preemption checking is needed at L4 (until we decide to permit
5-level paging for PV guests).
The validation side additions are done just for symmetry.
This is part of XSA-290.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
bac4567a67d5e8b916801ea5a04cf8b443dfb245
master date: 2019-03-05 13:51:46 +0100
Jan Beulich [Tue, 5 Mar 2019 14:01:39 +0000 (15:01 +0100)]
x86/mm: also allow L2 (un)validation to be fully preemptible
Commit
c612481d1c ("x86/mm: Plumbing to allow any PTE update to fail
with -ERESTART") added assertions next to the {alloc,free}_l2_table()
invocations to document (and validate in debug builds) that L2
(un)validations are always preemptible.
The assertion in free_page_type() was now observed to trigger when
recursive L2 page tables get cleaned up.
In particular put_page_from_l2e()'s assumption that _put_page_type()
would always succeed is now wrong, resulting in a partially un-validated
page left in a domain, which has no other means of getting cleaned up
later on. If not causing any problems earlier, this would ultimately
trigger the check for ->u.inuse.type_info having a zero count when
freeing the page during cleanup after the domain has died.
As a result it should be considered a mistake to not have extended
preemption fully to L2 when it was added to L3/L4 table handling, which
this change aims to correct.
The validation side additions are done just for symmetry.
This is part of XSA-290.
Reported-by: Manuel Bouyer <bouyer@antioche.eu.org>
Tested-by: Manuel Bouyer <bouyer@antioche.eu.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
176ebf9c8bc2828f6637eb61cc1cf166e302c699
master date: 2019-03-05 13:51:18 +0100
George Dunlap [Tue, 5 Mar 2019 14:01:09 +0000 (15:01 +0100)]
xen: Make coherent PV IOMMU discipline
In order for a PV domain to set up DMA from a passed-through device to
one of its pages, the page must be mapped in the IOMMU. On the other
hand, before a PV page may be used as a "special" page type (such as a
pagetable or descriptor table), it _must not_ be writable in the IOMMU
(otherwise a malicious guest could DMA arbitrary page tables into the
memory, bypassing Xen's safety checks); and Xen's current rule is to
have such pages not in the IOMMU at all.
At the moment, in order to accomplish this, the code borrows HVM
domain's "physmap" concept: When a page is assigned to a guest,
guess_physmap_add_entry() is called, which for PV guests, will create
a writable IOMMU mapping; and when a page is removed,
guest_physmap_remove_entry() is called, which will remove the mapping.
Additionally, when a page gains the PGT_writable page type, the page
will be added into the IOMMU; and when the page changes away from a
PGT_writable type, the page will be removed from the IOMMU.
Unfortunately, borrowing the "physmap" concept from HVM domains is
problematic. HVM domains have a lock on their p2m tables, ensuring
synchronization between modifications to the p2m; and all hypercall
parameters must first be translated through the p2m before being used.
Trying to mix this locked-and-gated approach with PV's lock-free
approach leads to several races and inconsistencies:
* A race between a page being assigned and it being put into the
physmap; for example:
- P1: call populate_physmap() { A = allocate_domheap_pages() }
- P2: Guess page A's mfn, and call decrease_reservation(A). A is owned by the domain,
and so Xen will clear the PGC_allocated bit and free the page
- P1: finishes populate_physmap() { guest_physmap_add_entry() }
Now the domain has a writable IOMMU mapping to a page it no longer owns.
* Pages start out as type PGT_none, but with a writable IOMMU mapping.
If a guest uses a page as a page table without ever having created a
writable mapping, the IOMMU mapping will not be removed; the guest
will have a writable IOMMU mapping to a page it is currently using
as a page table.
* A newly-allocated page can be DMA'd into with no special actions on
the part of the guest; However, if a page is promoted to a
non-writable type, the page must be mapped with a writable type before
DMA'ing to it again, or the transaction will fail.
To fix this, do away with the "PV physmap" concept entirely, and
replace it with the following IOMMU discipline for PV guests:
- (type == PGT_writable) <=> in iommu (even if type_count == 0)
- Upon a final put_page(), check to see if type is PGT_writable; if so,
iommu_unmap.
In order to achieve that:
- Remove PV IOMMU related code from guest_physmap_*
- Repurpose cleanup_page_cacheattr() into a general
cleanup_page_mappings() function, which will both fix up Xen
mappings for pages with special cache attributes, and also check for
a PGT_writable type and remove pages if appropriate.
- For compatibility with current guests, grab-and-release a
PGT_writable_page type for PV guests in guest_physmap_add_entry().
This will cause most "normal" guest pages to start out life with
PGT_writable_page type (and thus an IOMMU mapping), but no type
count (so that they can be used as special cases at will).
Also, note that there is one exception to to the "PGT_writable => in
iommu" rule: xenheap pages shared with guests may be given a
PGT_writable type with one type reference. This reference prevents
the type from changing, which in turn prevents page from gaining an
IOMMU mapping in get_page_type(). It's not clear whether this was
intentional or not, but it's not something to change in a security
update.
This is XSA-288.
Reported-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit:
fe21b78ef99a1b505cfb6d3789ede9591609dd70
master date: 2019-03-05 13:48:32 +0100
George Dunlap [Tue, 5 Mar 2019 14:00:29 +0000 (15:00 +0100)]
steal_page: Get rid of bogus struct page states
The original rules for `struct page` required the following invariants
at all times:
- refcount > 0 implies owner != NULL
- PGC_allocated implies refcount > 0
steal_page, in a misguided attempt to protect against unknown races,
violates both of these rules, thus introducing other races:
- Temporarily, the count_info has the refcount go to 0 while
PGC_allocated is set
- It explicitly returns the page PGC_allocated set, but owner == NULL
and page not on the page_list.
The second one meant that page_get_owner_and_reference() could return
NULL even after having successfully grabbed a reference on the page,
leading the caller to leak the reference (since "couldn't get ref" and
"got ref but no owner" look the same).
Furthermore, rather than grabbing a page reference to ensure that the
owner doesn't change under its feet, it appears to rely on holding
d->page_alloc lock to prevent this.
Unfortunately, this is ineffective: page->owner remains non-NULL for
some time after the count has been set to 0; meaning that it would be
entirely possible for the page to be freed and re-allocated to a
different domain between the page_get_owner() check and the count_info
check.
Modify steal_page to instead follow the appropriate access discipline,
taking the page through series of states similar to being freed and
then re-allocated with MEMF_no_owner:
- Grab an extra reference to make sure we don't race with anyone else
freeing the page
- Drop both references and PGC_allocated atomically, so that (if
successful), anyone else trying to grab a reference will fail
- Attempt to reset Xen's mappings
- Reset the rest of the state.
Then, modify the two callers appropriately:
- Leave count_info alone (it's already been cleared)
- Call free_domheap_page() directly if appropriate
- Call assign_pages() rather than open-coding a partial assign
With all callers to assign_pages() now passing in pages with the
type_info field clear, tighten the respective assertion there.
This is XSA-287.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit:
3d4868a481eebed232eeacba36ea28e5dee5e946
master date: 2019-03-05 13:48:08 +0100
Jan Beulich [Tue, 5 Mar 2019 13:59:48 +0000 (14:59 +0100)]
IOMMU/x86: fix type ref-counting race upon IOMMU page table construction
When arch_iommu_populate_page_table() gets invoked for an already
running guest, simply looking at page types once isn't enough, as they
may change at any time. Add logic to re-check the type after having
mapped the page, unmapping it again if needed.
This is XSA-285.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tentatively-Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
1f0b0bb7773d537bcf169e021495d0986d9809fc
master date: 2019-03-05 13:47:36 +0100
Jan Beulich [Tue, 5 Mar 2019 13:58:42 +0000 (14:58 +0100)]
gnttab: set page refcount for copy-on-grant-transfer
Commit
5cc77f9098 ("32-on-64: Fix domain address-size clamping,
implement"), which introduced this functionality, took care of clearing
the old page's PGC_allocated, but failed to set the bit (and install the
associated reference) on the newly allocated one. Furthermore the "mfn"
local variable was never updated, and hence the wrong MFN was passed to
guest_physmap_add_page() (and back to the destination domain) in this
case, leading to an IOMMU mapping into an unowned page.
Ideally the code would use assign_pages(), but the call to
gnttab_prepare_for_transfer() sits in the middle of the actions
mirroring that function.
This is XSA-284.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit:
6d4f36c3fecc0a6a0991716199612c81d909316e
master date: 2019-03-05 13:45:58 +0100
Ian Jackson [Thu, 28 Feb 2019 17:04:37 +0000 (17:04 +0000)]
Declare fast forward / record previous work
[git-debrebase pseudomerge: stitch]
Ian Jackson [Thu, 28 Feb 2019 16:38:37 +0000 (16:38 +0000)]
finalise 4.11.1+
26-g87f51bf366-3
Ian Jackson [Thu, 28 Feb 2019 16:14:20 +0000 (16:14 +0000)]
Commit patch queue (exported by git-debrebase)
[git-debrebase make-patches: export and commit patches]
Ian Jackson [Wed, 27 Feb 2019 16:37:32 +0000 (16:37 +0000)]
/etc/default/xen: Handle with ucf
We reintroduced this file in
"d/xen-utils-common.install: ship /etc/default/xen"
eg
2f34db35dd27abb4280d38ebc4464c21f64df8c9
But I had forgotten that this file was previously shipped and handled
by ucf; and we have machinery to try to remove it.
This leaves the following possible cases:
(a) stretch: file handled by ucf
(b) older buster: file not shipped, ucf postinst action not done
| file remains recorded by ucf and ucfr, but that the original
| version of the file is no longer present on the user's machine
(c) newer buster: file shipped, file recorded by ucf with one
old version and as dpkg conffile by newer version
(a) and (b) will be handled correctly by just using ucf in the normal
way.
(c) xxx needs testing
So:
* Drop the special call to ucf-remove-fixup; retain the call to ucf
* Switch to shipping the file in /usr/share as expected by ucf
We do not remove the file's entry from not-installed - see the
comment relating to #831786.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Sun, 10 Feb 2019 17:26:45 +0000 (18:26 +0100)]
tools/xl/bash-completion: also complete 'xen'
We have the `xen` alias for xl in Debian, since in the past it was a
command that could execute either xl or xm.
Now, it always does xl, so, complete the same stuff for it as we have
for xl.
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
[git-debrebase split: mixed commit: upstream part]
Ian Jackson [Wed, 27 Feb 2019 16:45:00 +0000 (16:45 +0000)]
changelog: start 3~
Ian Jackson [Fri, 22 Feb 2019 12:24:35 +0000 (12:24 +0000)]
pygrub: Specify -rpath LIBEXEC_LIB when building fsimage.so
If LIBEXEC_LIB is not on the default linker search path, the python
fsimage.so module fails to find libfsimage.so.
Add the relevant directory to the rpath explicitly.
(This situation occurs in the Debian package, where
--with-libexec-libdir is used to put each Xen version's libraries and
utilities in their own directory, to allow them to be coinstalled.)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Sun, 10 Feb 2019 17:26:45 +0000 (18:26 +0100)]
tools/xl/bash-completion: also complete 'xen'
We have the `xen` alias for xl in Debian, since in the past it was a
command that could execute either xl or xm.
Now, it always does xl, so, complete the same stuff for it as we have
for xl.
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
[git-debrebase split: mixed commit: debian part]
Bastian Blank [Sat, 5 Jul 2014 09:47:01 +0000 (11:47 +0200)]
pygrub: Set sys.path
We install libfsimage in a non-standard path for Reasons.
(See debian/rules.)
This patch was originally part of `tools-pygrub-prefix.diff'
(eg commit
51657319be54) and included changes to the Makefile to
change the installation arrangements (we do that part in the rules now
since that is a lot less prone to conflicts when we update) and to
shared library rpath (which is now done in a separate patch).
(Commit message rewritten by Ian Jackson.)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
squash! pygrub: Set sys.path and rpath
Hans van Kranenburg [Sun, 24 Feb 2019 17:23:51 +0000 (18:23 +0100)]
d/not-installed: work around dh-exec bug
dh-exec breaks dh_missing by failing to register the installed files.
Put a workaround in place.
Closes: #923013
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Ian Jackson [Thu, 21 Feb 2019 16:05:40 +0000 (16:05 +0000)]
hotplug-common: Do not adjust LD_LIBRARY_PATH
This is in the upstream script because on non-Debian systems, the
default install locations in /usr/local/lib might not be on the linker
path, and as a result the hotplug scripts would break.
A reason we might need it in Debian is our multiple version
coinstallation scheme. However, the hotplug scripts all call the
utilities via the wrappers, and the binaries are configured to load
from the right place anyway.
This setting is an annoyance because it requires libdir, which is an
arch-specific path but comes from a file we want to put in
xen-utils-common, an arch:all package.
So drop this setting.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Fri, 22 Feb 2019 16:06:29 +0000 (17:06 +0100)]
d/[..]/grub.d/xen.cfg: dom0_mem max IS needed
I misread the upstream documentation. Adding max: for x86 is actually
necessary to not have it "attempt to allocate enough page tables to
cover all of *host* RAM which can exhaust its actual memory allocation".
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Hans van Kranenburg [Sat, 9 Feb 2019 16:27:26 +0000 (17:27 +0100)]
sysconfig.xencommons.in: Strip and debianize
Strip all options that are for stuff we don't ship, which is 1)
xenstored as stubdom and 2) xenbackendd, which seems to be dead code
anyway. [1]
It seems useful to give the user the option to revert to xenstored
instead of the default oxenstored if they really want.
[1] https://lists.xen.org/archives/html/xen-devel/2015-07/msg04427.html
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Acked-by: Ian Jackson <ijackson@chiark.greenend.org.uk>